Add PCI bus ID as device metadata on Windows#28066

Open
chilo-ms wants to merge 3 commits into main from chi/add_pci_bus_id_as_metadata_for_windows

Conversation

Contributor

@chilo-ms chilo-ms commented Apr 14, 2026

Description

This PR mainly has two changes:

  1. Add PCI bus ID as device metadata on Windows. (Linux already has this)
  2. The plugin CUDA EP calls cudaDeviceGetByPCIBusId with the provided PCI bus ID to obtain the CUDA device ordinal, which is used as the device id (and later for OrtMemoryInfo creation). Even when ORT's hardware enumeration order diverges from the CUDA-visible ordinal order, the plugin CUDA EP ensures the allocator/memory-info matches the device.

Motivation and Context

The current code in the plugin CUDA EP,

  int current_device_id = cuda_device_index++;

assumes that the Nth NVIDIA GPU in ORT's hardware enumeration order corresponds to CUDA ordinal N. This is fragile because:

  1. ORT enumerates hardware devices via OS-level APIs (e.g., DXGI on Windows, sysfs on Linux), which may order GPUs by PCI bus address.
  2. CUDA orders devices by compute capability (descending), then by PCI bus ID — unless CUDA_DEVICE_ORDER=PCI_BUS_ID is set.
  3. If these orderings diverge (e.g., a less capable GPU has a lower PCI bus address), CUDA ordinal 0 ≠ ORT's first NVIDIA device.

@chilo-ms chilo-ms marked this pull request as ready for review April 14, 2026 22:48
@chilo-ms chilo-ms requested review from tianleiwu and yuslepukhin and removed request for yuslepukhin April 14, 2026 23:09
Contributor

@tianleiwu tianleiwu left a comment


Summary

Well-motivated fix that addresses a real correctness issue with CUDA device ordinal mapping when OS and CUDA enumeration orders diverge. The PCI bus ID approach is the right solution.

One high-priority concern about the fallback path producing duplicate ordinals when CUDA_VISIBLE_DEVICES restricts GPU visibility.

Positives

  • PCI bus ID resolution via cudaDeviceGetByPCIBusId is the correct, stable way to map physical GPUs to CUDA ordinals
  • Good defensive checks on metadata null, empty string, CUDA API failure, and ordinal range validation
  • Clean parsing of SPDRP_LOCATION_INFORMATION with correct decimal-to-hex conversion
  • Excellent test coverage, especially the PciBusIdRoundTrip format consistency test

Concerns

#  Severity    Component             Issue
1  High        cuda_ep_factory.cc    Fallback ordinal collision when PCI bus ID present but CUDA resolution fails
2  Suggestion  device_discovery.cc   PCI domain hardcoded to 0000
3  Nitpick     test file             Missing trailing newline

See inline comments for details.

Comment on lines +198 to 220
// Try to resolve the CUDA ordinal from pci_bus_id metadata if available.
// This is more reliable than counter-based ordinal assignment because it is
// not affected by enumeration order, CUDA_VISIBLE_DEVICES remapping, or
// mixed-vendor GPU configurations.
int current_device_id = -1;
const OrtKeyValuePairs* metadata = factory->ort_api_.HardwareDevice_Metadata(&device);
if (metadata != nullptr) {
  const char* pci_bus_id = factory->ort_api_.GetKeyValue(metadata, "pci_bus_id");
  if (pci_bus_id != nullptr && pci_bus_id[0] != '\0') {
    int resolved_ordinal = -1;
    cudaError_t err = cudaDeviceGetByPCIBusId(&resolved_ordinal, pci_bus_id);
    if (err == cudaSuccess && resolved_ordinal >= 0 && resolved_ordinal < cuda_device_count) {
      current_device_id = resolved_ordinal;
    }
  }
}

// Fallback: if pci_bus_id was not available, use counter-based ordinal assignment.
if (current_device_id < 0) {
  current_device_id = cuda_device_index++;
}

// Validate the assigned ordinal is within the range of CUDA-visible devices.
Contributor


High — Fallback ordinal collision with CUDA_VISIBLE_DEVICES

When pci_bus_id is present but cudaDeviceGetByPCIBusId fails (e.g., device hidden by CUDA_VISIBLE_DEVICES), the code falls back to cuda_device_index++. This counter-based ordinal can collide with a PCI-resolved ordinal from another visible device.

Scenario (CUDA_VISIBLE_DEVICES=1, two NVIDIA GPUs):

  • cuda_device_count = 1 (one visible device, remapped to ordinal 0)
  • GPU0 (PCI 01:00.0): cudaDeviceGetByPCIBusId → cudaErrorInvalidDevice → fallback → ordinal 0
  • GPU1 (PCI 65:00.0): cudaDeviceGetByPCIBusId → ordinal 0 (the visible device)
  • Result: Both GPUs assigned ordinal 0

Suggested fix: When pci_bus_id IS present but CUDA resolution fails, skip the device (it's not CUDA-visible) rather than falling back to the counter:

if (pci_bus_id != nullptr && pci_bus_id[0] != '\0') {
  int resolved_ordinal = -1;
  cudaError_t err = cudaDeviceGetByPCIBusId(&resolved_ordinal, pci_bus_id);
  if (err == cudaSuccess && resolved_ordinal >= 0 && resolved_ordinal < cuda_device_count) {
    current_device_id = resolved_ordinal;
  } else {
    // Device has a PCI bus ID but is not visible to CUDA
    // (e.g., filtered by CUDA_VISIBLE_DEVICES). Skip it.
    continue;
  }
}

}
}

// Generate telemetry event to log the GPU and NPU driver name and version.
Contributor


Suggestion — PCI domain hardcoded to 0000

The domain is hardcoded to 0000 in the format string. While this is correct for the vast majority of systems, multi-segment PCI configurations (non-zero domain) exist in some enterprise server hardware. The Linux implementation extracts the actual domain from sysfs.

Consider adding a brief comment noting this limitation (e.g., // Domain is 0000 for standard single-segment PCI; multi-segment configs are not supported), or exploring DEVPKEY_Device_LocationPaths which may contain the full domain.

} // namespace test
} // namespace onnxruntime

#endif  // defined(ORT_UNIT_TEST_HAS_CUDA_PLUGIN_EP)
(No newline at end of file)
Contributor


Nitpick: Missing trailing newline at end of file.
